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Abstract 

An efficient, iterative semi-implicit (SI) numerical method for the time integration of stiff wave systems is presented. 
Physics-based assumptions are used to derive a convergent iterative formulation of the SI scheme which enables the 
monitoring and control of the error introduced by the SI operator. This iteration essentially turns a semi-implicit 
method into a fully implicit method. Accuracy, rather than stability, determines the timestep. The scheme is second- 
order accurate and shown to be equivalent to a simple preconditioning method. We show how the diffusion operators 
can be handled so as to yield the property of robust damping, i.e., dissipating the solution at all values of the 
parameter VAt, where 2? is a diffusion operator and At the timestep. The overall scheme remains second-order 
accurate even if the advection and diffusion operators do not commute. In the limit of no physical dissipation, and 
for a linear test wave problem, the method is shown to be symplectic. The method is tested on the problem of Kinetic 
Alfven wave mediated magnetic reconnection. A Fourier (pseudo-spectral) representation is used. A 2-field gyrofluid 
model is used and an efficacious k-space SI operator for this problem is demonstrated. CPU speed-up factors over a 
CFL-limited explicit algorithm ranging from ~ 20 to several hundreds are obtained, while accurately capturing the 
results of an explicit integration. Possible extension of these results to a real-space (grid) discretization is discussed. 

Key words: Semi-implicit methods, robust damping, magnetic reconnection, gyrofluid, dispersive plasma waves, symplectic 
methods 
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1. Introduction 

One ubiquitous property of the dynamics of fluids and plasmas is their inherent multiscale character, both 
spatial and temporal [l|, I4I . This translates into a stiffness of the partial differential equations (PDEs) that 
are used to describe the physical systems of interest. The numerical solution of stiff systems of coupled PDEs 
is notoriously difficult, presenting a serious challenge to present day computational resources. 

Wave-induced temporal stiffness occurs when the characteristic time scale of interest is much slower than 
that of the fastest waves present in the system. Fortunately, however, it is often the case that it is not 
necessary to accurately capture these waves in order to obtain accurate solutions of the slowly evolving 
physics d, 3 . Examples are the gravity wave in weather modeling 0, 0| , or the Alfven and whistler waves in 

* Corresponding author 

Email addresses: loureiroOpppl.gov (N. F. Loureiro), hainmettOpppl . gov (G. W. Hammett). 

URLs: http://www.csccmim.umd.edu/cmpd/people/loureiro/ (N. F. Loureiro), http://w3.pppl.gov/~haimnett/ 
(G. W. Hammett). 



Preprint submitted to Journal of Computational Physics 



1 February 2008 



studies of some instabilities in plasmas 

0, Si, HE El, [13 • When that occurs, implicit and semi-implicit (SI) 
numerical methods have been employed as very powerful tools to potentially allow the stable and accurate 
time integration of the PDEs on timescales which largely exceed those imposed by these high frequency, 
fast waves (see, for example, the paper by Knoll et al. [13| and references therein). More conventional and 
easy to implement explicit numerical schemes arc limited by the well-known Courant-Friedrichs-Lewy (CFL) 
condition and thus, in order to be stable, require integration timesteps of the order of the inverse of the 
maximum frequency present - a very severe, and generally unfeasible, requirement. 

There are advantages and disadvantages to both purely implicit and SI numerical methods. Implicit 
and SI methods can both use time steps much larger than the CFL limit. Implicit methods are more 
robust and accurate but can be harder to implement, in some cases requiring more extensive modifications 
to the structure of an existing code. Direct solution of an implicit scheme can require the storage and 
inversion of impractically large matrices (though direct sparse matrix solvers with reordering can be effective 
for some problems that are not too large). Thus implicit methods often employ preconditioned iterative 
methods (such as Krylov methods like Conjugate-Gradients or GMRES), sometimes in combination with 
multigrid or FFT solvers (see for example references l3, 3], which also discuss nonlinear problems using 



Jacobian-Free Newton-Krylov solvers). To obtain a net speed up, it is often crucial to have a sufficiently good 
preconditioner that can still be efficiently inverted. While there are many options 14, 13, 0, [l^l, finding 



such a preconditioner is problem-dependent and can be a non-trivial task. If no preconditioner is used, a 
Krylov iterative solver applied directly to a hyperbolic wave/ad vection problem will usually give at most an 
order unity speed up in net CPU time, because the larger time step allowed by an implicit method is largely 
offset by the increased number of iterations required per time step. However, when a good preconditioner 
exists, or a multigrid or FFT solver can be used, then significant savings in CPU time are possible. 

SI methods are usually simpler to implement and work well for some problems. However, they have to be 
used with some care, because in some parameter regimes there can be splitting errors (due to the different 
treatment of the operators involved) and/or uncontrolled approximations to the equations that are made in 
their derivation (such as linearization of the fast wave physics). The reader is referred to the work of Knoll 
and Keyes and Knoll et al. for an overview of SI methods in a variety of applications and their 
comparison to the JFNK approach to fully implicit methods . 

In this paper, we revisit SI methods as they are traditionally formulated in the field of plasma physics 17 , 

0, @, [l3|- Using physics-based assumptions, we are able to consistently derive an iterative formulation of SI 
schemes, which lends itself to an efficient way of monitoring and controlling the error introduced by the SI 
operator. Accuracy, rather than stability, determines the timestep. It is shown that this iterative extension 
of an SI scheme leads to essentially a fully implicit method with the SI operator providing a preconditioner. 
We also demonstrate how the diffusive terms can be treated so as to yield the property of robust damping, 

1. e., the consistent damping of all the modes present in the simulation domain in the limit A< oo. The SI 
method presented here is second-order accurate even if the diffusion and nonlinear advection operators do 
not commute. 

The method is then applied to the problem of gyroffuid magnetic reconnection. This is a very stiff appli- 
cation due to the presence of a high frequency dispersive wave, the Kinetic Alfven wave (KAW) . A Fourier- 
space, pseudo-spectral, discretization of the equations is used. Compared to explicit integrations schemes, 
our SI scheme yields CPU speed-up factors of ~ 20 to several hundreds while accurately reproducing the 
results of an explicit integration. 

Before proceeding further, we caution the reader that, although our derivation of the SI algorithm pre- 
sented in section 12.11 is general, we have only tested it in a pseudo-spectral implementation, and only for 
a 2-field problem that involves a single wave family (a single dispersion relation ±uj vs. k). The efficiency 
of this method relies heavily on the effectiveness of the SI operator used, which may be complicated for 
problems with strong spatial inhomogeneities, strong anisotropics, or for multi-field problems with multi- 
ple waves. Also, our treatment of diffusion terms with a robust damping algorithm has been tested only 
in a Fourier representation, although we discuss, in the conclusions section, how a similar result might be 
achieved by grid codes. While we believe this SI algorithm could be useful in real-space codes at least for 
certain types of problems, the reader interested in other applications should proceed with caution. 

Finally, we note that although our focus is in the plasma physics field, the derivation of the iterative 
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SI method presented is general and should apply well to other stiff-wave physics phenomena that can be 
described by means of advection-diffusion PDEs. 



2. Numerical scheme 

2.1. Derivation of the method 

As mentioned above, we are interested in dealing with coupled sets of two-dimensional advection-diffusion 
partial differential equations. Let us consider the following set as our general model: 

^ = .F(0,^) + (7?v2-,y^v4)V., (1) 
^ = g{cb,ij) + {^w'-„H^^)^- (2) 

Here, '0 are arbitrary scalar fields, the operators JF, C/ represent the nonlinear advection-like operators, r] 
and rjH are the resistivity and hyper-resistivity and i> and vh the viscosity and hyper-viscosity, respectively. 
In Fourier space, we can define the diffusion operators as: 

^ ijk"^ + rjEkj^, Vy^vk^+vuk]^ (3) 

and thus rewrite the above set (in Fourier space), as: 

^=T{cf>,ij)-V^i,k, (4) 
^=g{c^,^)-V,<p,. (5) 

Here, the k subscripts denote variables in Fourier space and JF, Q represent the Fourier transforms of the 
nonlinear operators J^, Q. For notational simplicity, we will drop both the subscripts and the over bars: it is 
understood that we are working in Fourier space. We note also that above we have chosen a fourth order 
operator to represent the hyperviscosity, but the following derivation holds irrespective of the order chosen 
for this operator. 

The diffusion terms can be integrated analytically by means of the following variable transformations: 

i/, = e-^"*i^, 4) = e-^"*(^. (6) 

Equations ([3HS1) thus become: 

= e^"*.F((/.,0), (7) 



dt 



■ e 



Q{'P,i^)- (8) 



dt 

We now apply a Crank-Nicolson discretization scheme (spatial indices omitted for simplicity): 

=i.F(0",^") + le^'.^*.F(0"+\V'"+^), (9) 



At 2 ' 2 



^g{r,r) + -e^"^*e(0"+\ 0"+i). (10) 

(Without loss of generality, we have defined i„ = 0, so tn+i = At.) 

Note that the handling of the diffusion terms through the variable substitutions of ([S]) is a trivial step for 
spectral and pseudo-spectral codes, but more complicated in real-space codes. However, the derivation of 
the semi-implicit method that follows holds irrespective of how the diffusion terms are handled - this is only 
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important for the robust damping property, to be discussed later on in section 12.61 In section [5] we briefly 
discuss how a similar treatment of the diffusive terms can be achieved in real space codes. 

Usual implicit schemes would now invert equations (p MlOp to determine the unknowns ^/'"+^, 0""*"^ in 
terms of the known quantities ip"', 0". This may be done by, for example, an iterative Jacobian-Free Newton 
Krylov solver. A Newton method turns this nonlinear root problem into a series of linear problems. These 
linear problems can often involve very large sparse matrices. If a direct inversion was done, the very large 
timestep enhancements that can be obtained relative to an explicit scheme might be severely offset by the 
increased CPU time per step. Jacobian-free Krylov iteration algorithms can be used to solve the resulting 
linear problems without explicitly forming the matrices. If these iterative methods converge quickly enough 
(which for hyperbolic problems may require very effective preconditioners), then the implicit method can 
provide a significant improvement in CPU time relative to an explicit algorithm. 

The aim of our method, like more complicated iterative methods, is to circumvent the need to directly 
invert a large linear problem. We will use a simple iterative method, with a semi-implicit operator providing 
an effective physics-based preconditioner. The first step is to reformulate equations ((9l- |10[) in an iterative 
way. To this effect, we begin by rewriting these equations as: 



J,n+l,p+l _ J.n 1 1 
At 



= ic;(^",^") + ie^-'^*Cy((/."+i-^'+\^"+i'^'+i). (12) 



M ' 2 

Next, we introduce one simple, but crucial, physics based assumption: that the stiff waves arise through the 
coupling of the equations and are well described by the assumptions 



J^(V'"+^'^, ?i"+^''') + ^ - , (13) 



; e(V'""^^'P, 0"+i'^') + ^ - V-""*"^'^) ■ (14) 



In other words, we are assuming that the dependence oi J- on ^ is relatively weak (i.e., the fast timescales 
arise via the dependence of T on 0, not on ip) and can be approximated from the previous iteration p, 
while it depends more sensitively on 4> and so we need to keep its dependence on the present p-\-\ iteration 
value by using a Taylor series (and vice-versa for Q). This is certainly true in the linear limit for our 
model, as shown by equations Most wave phenomena involve oscillations between two different 

variables (for example, pressure and velocity for sound waves, or position and velocity for a pendulum), so 
that approximations like the above are often justified. A key step when extending this approach to more 
complicated systems of equations, which might contain multiple waves, is to identify the key terms involved 
in the fast wave dynamics. 

[We have written the Jacobians in the above Taylor-series expansions as functional derivatives, such as 
5J- /5(j)^ since T may be a non-local functional that operates on <f). After spatial discretization, 8J- /Sip will in 
general be a large sparse matrix. However, in our final algorithm all we have to evaluate is an approximation 
to {5!F / 5(p){/S.Q / Sip) . This approximation will be local in Fourier space, and so for simplicity we will replace 
the functional derivative notation with the standard notation dlF /dep. Note that in our final algorithm we 
never actually need to explicitly evaluate the Jacobian, just the action of a Jacobian on a vector, and so this 
formulation is "Jacobian-free" .] 

With this assumption, the nonlinear future time term in equation ([9]) can be simplified by means of the 
Taylor expansion, i.e.: 
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JJn+l.p+l _ J,n I 



At 2 

PIT 1 

(15) 
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An expression for the last term on the RHS of this equation is straightforwardly obtained (for p > 0) by 
taking the difference between equation evaluated for two successive values of p: 

where again we have used the approximations of equation (|13p . Undoing the variable substitution of equa- 
tion equation ([TS]) thus becomes: 

At this point we note that if the above equation is iterated until convergence, i.e., |?/)"+1'P+i — iI;"-+^-p\ < e, 
where e is the prescribed accuracy, the last term becomes negligible. Therefore, it is not important to 
retain its exact functional form as long as convergence is reached at each timestep. We can thus simplify 
equation (|17|) by replacing the last term on the RHS with an analytically invertible approximation. We use: 

where uj is given by the dispersion relation, or an approximation to it. [Specific forms for the operator (2i^ 
will be discussed later, for now it should just be thought of as an operator that is relatively easy to invert.] 
The steps we have just taken are similar to those for some previous SI derivations and share some key 
advantages for high-frequency waves, since it is only the combination {d!F/d(f>){dG /dtp) that has to be 
approximated by the semi-imphcit operator, and not {d!F/d(f)) and {dQ/dijj) separately. As discussed in 
the SI literature, this is much simpler because one does not have to worry about getting the sign of the 
approximation uj correct (i.e., one does not have to worry about the direction of propagation of the waves). 
Chacon and Knoll ll|, [ij, ll3[ have shown how this kind of physics-based semi-implicit operator can be 



related to a Schur complement prcconditioncr, turning an originally hyperbolic problem into a diagonally 
dominant parabolic problem that can be efficiently solved by preconditioned Krylov (if the preconditioner 
is sufficiently effective) and/or multigrid iterative techniques. In some cases, the resulting problem can also 
be efficiently solved with FFTs, which we use here. 
So far, we have: 

'■'^/^^^ ,1,1 

_ ^« + l^P)_ (19) 



4 

^n+i.p+i ^ e-^-^V" + ^e-^'^^*5(0", 7^") + ^g(0"+i'P, V"+''^+'). (20) 

Equations (|19H20[) constitute the basis of our iterative semi-implicit method. Since this iteration converges 
(as we will later show) to the Crank- Nicolson (CN) solution for an arbitrary initial condition (within some 
domain), it is tempting to just use these equations with the initial guess ^/I'^+i-" = 0"+i,o _ ^pn^ However, 
one finds that if one docs this for an undamped wave test problem, the amplitude of the solution converges 
to the CN solution from above, so the algorithm is numerically unstable for any finite number of iterations. 
One can demonstrate this by considering the test problem = ujcj), Q = — w0i T^ri = T^v = 0, finding that 
the squared amphfication factor for the first iteration gives | = ((V'"'*'"^'^)^ -t- (0"+-^'-^)^) /((V'")^ + ('Z'")^) = 
1 + 2tj^Ai^/(4 + tD^AP). For high frequency modes that we are trying to treat implicitly, this becomes 
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\A\'^ w 1 + 210^ jOP' , which is significantly larger than unity. While further iterations would converge towards 
the Crank-Nicolson solution, which has \A^ = 1, this significant numerical amplification would still be 
problematic, and would require that the semi-implicit operator be an extremely good approximation to the 
real dynamics in order to converge quickly. Furthermore, it is somewhat bothersome that this algorithm 
would give such a large numerical amplification, since it is supposed to be an approximation to the stable 
Crank-Nicolson algorithm given by equations ((9 Hl0|l . However, note equations (|19ll20ll were derived using 
equation (fTB|) . which is valid only for p > 0, so an expression for the initial p = step of the iteration 
requires further consideration. It turns out that a more careful treatment of the initial step will also make 
the final algorithm stable. Using equation (fT4| in equation ((20|) for p = 0, we have: 



(21) 



~2 



Let us define the quantity 0"+^'* as: 



which reduces equation (pi]) to: 

^n+1,1 ^ ^„+l,* ^ _ ^n+lfl^ (23) 

Substituting this into equation (jlSp evaluated at p = 0, and using equation p4p . yields 



where again the last term on the RHS can be simplified making use of approximation (|18p . Equation 
followed by equation (|20p (evaluated for p = 0) would give a 1st order accurate prediction of (-0"+^'^, </)"+^'^). 
This could then be iterated once with Equations p9l420|) to obtain a second order accurate calculation for 
(•^"+1,2^ 0"+i,2^_ However, a small modification to the prediction step can give a 2ed order accurate results 
for (?/)"+^'^, 0"+^'-'^). This is done by replacing in the nonlinear term of equation with a 1st order 

accurate approximation, ■0"+^'*, so that equation (IM)) becomes 

=^.F(0"+i'*,i^"+i^*) - ^^^^ (^"+1.1 _ ^"+1.0) ^ (25) 
where is defined as: 

^n+l,* ^ e-P.At^„ ^ ^g-I'„At_^(^„^ ^ ^_^(0"+l,O^ V"+''°)- (26) 

Note that the replacement of JF((/)"+^'*, ■!/'"+^'°) with jr(0"+i'*, '(/;"+i'^) is consistent with the assumption 
of equation (jl3p that the dependence oi T on ^ is relatively weak, but is useful to ensure second order 
accuracy of the calculation of ■!/'"+^'^. Similarly, when calculating 0"+!'" from equation (pn)) for p = 0, the 
term fJ(0"+^'°, '0"+i'^) is replaced by C/((/)"+^'*, -0"+^'^). This gives a 2ed order accurate result for 
The choice = -0", 0"+i'O = 0" closes the specification of our semi-implicit scheme. 

The final equations for the iterative SI algorithm can be written in a compact form as follows: 
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At 



-X'„At^ 



/ n+l,p 



At 



At 



(27) 
(28) 

(29) 
(30) 



where now = -,/;"+i^o ^ ^n+i,* ^nd ^»+i'0 = ^"+i.p for p > 0. 

We mentioned above that these equations ought to be iterated until convergence, at which point the last 
term on the RHS of equation (|29p is negligible and these equations become equivalent to the fully implicit 
equations ((9l-fT0|). The relative importance of this term can be quantified by: 



■'ji 



n+l,p+l 
ji 



(31) 



where £ = Qp-bd^ j^, j, i are the Fourier (or real) space grid point indexes and N is the total number of 
grid points. The timestep of the integration and/or the number of p iterations is thus determined by the 
requirement: 



max \£ 



p+i| 



< e 



(32) 

where e is the user prescribed accuracy. We emphasize that this error is a measure of the convergence of 
the iteration scheme to the Crank-Nicolson difference equation. It is not a direct control of the overall error 
in the integration (which, when converged, will be that of a CN scheme). For convenience, we also define 
the total number of iterations of equations ([291430]) as Pmax- I-e., Pmax is the number of times the semi- 
implicit operator is inverted in corrector steps. Pmax = 1 evaluates equations (P^HHO]) once, for p = 0, to 
determine (V'""''"'^'^, c/>"^^'^). Pmax — 2 evaluates equations (PM5D|) twice, for p = 0, 1, and calculates up to 

There are several variations of SI methods and their derivations in the literature, some using a more 
heuristic approach of adding and subtracting an operator to the RHS of one of the equations, and some 
taking a more systematic approach to deriving an SI operator. Also, some SI algorithms have used a leap- 
frog-like staggered time grid. Our final result is closest to certain SI methods using non-staggered grids. For 
example, our equations ((27l - [30|) iov p^ax = 1 are equivalent to Eqs. (11-14) of [11]. The formulation presented 
here makes it clear how to extend other semi-implicit methods to use additional iterations if desired (in this 
regard, note that it is important that the definition of is different on the first p = corrector step 

than in later iterations). These additional iterations provide a simple way to turn a semi- implicit method 
into a fully implicit method, and further demonstrate the relation between SI ideas and preconditioning for 
iterative methods, as discussed in section [2^ 



2.2. Linear stability analysis 



We will now show under which conditions the method described by equations (PTHSD]) is unconditionally 
stable with respect to the timestep At. For simplicity, we neglect the diffusive terms (since these terms are 
treated implicitly, they have a stabilizing effect). In line with the assumptions of (|13p . we assume that, in 
the linear regime, the following is true: 

^(0,V)«/0, 

5(0,V)«#. (33) 
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Taking the simplest case oi pmax = 1 (i-c, a predictor-corrector scheme) equations 
in matrix form as: 

At/2 g 



1 -At/2 g 




"0"+i,r 




1 









1 

Atf 



1 









r _ 



i + c 2(1 + /:) 

where C = lj'^ (At)'^ / 4. Inversion of the matrix on the LHS yields the equation 



can be written 



(34) 



"^n+i,r 








= A 








r _ 



(35) 



where A is the amplification matrix: 



A = 



/.gAt2 Atg {fgAt^ + iC + i) 



2C + 
Atf 

C + 1 



1 



4(£ + l) 

fgAt' 

2C + 2 



(36) 



The method is unconditionally stable when the eigenvalues X± of the matrix A are such that |A±| < 1. Using 
fg = —Lo'^j we obtain: 



A. 



4 + (cj2 - 2uj^) At^ ± 2iwAV4+ (t2;2 -tj2) 

4 + At2w2 



(37) 



It is straightforward to show that this algorithm is symplectic, |A±| = 1, if the argument of the square root 
is positive. Thus we have unconditional stability for arbitrarily large uAt if Cu'^At^ > uJ^ At^ — 4, or in the 
relevant limit of to At ^ 1, ii di'^ > uP' . Since the magnitude of the error at each iteration, expression (pij) is 
proportional to Qp' , it is clear that setting Qp w uP (but Cp > oP) is the best choice for stability and accuracy. 
(Of course, nonlinearly uP is not known, so one wishes to set Qp to be an approximate upper bound on the 
true operator up .) Here we have demonstrated that the first corrector step, ymax = 1, gives a solution that 
is symplectic (at least for this linear wave test problem). Additional iterations would eventually converge to 
the Crank-Nicolson solution, which is also symplectic. We have also proven that the algorithm is symplectic 
for arbitrary pm.ax, as discussed at the end of the next section. 

Finally, note that a Taylor series expansion of expressions ([37]) yields: 



A+ = 1 ± iujAt - 



±0{At^), 



consistent with the fact that the first order accurate predictor step of equations 
corrector iterations are 2cd order accurate. 



(38) 

ensures that the 



2.3. Linear convergence rate 



To determine the convergence rate of the iterative scheme presented in equations 



we set: 



in+l,p+l _ JJi+l 



n+l,p+l 



--i'' 



n+l 



(39) 
(40) 



where the first term on the RHS is the converged solution and 50"+^'^+^, are the error at each 

iteration. Again using the linear approximations of equations (|33p . it is straightforward to obtain the equation 



At^fg AtgC 



4(£ + l) 2(£ + l) 
Atf C 

2(/: + l) £ + 1 



(41) 
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The matrix on the RHS has the eigenvalues 



Clearly A2 is the eigenvalue of interest. In the limit of iuAt 3> 1, this expression simplifies to: 

A2 - 1 - (43) 



We showed in the previous section that o)^ « is the best choice in terms of achieving stability while 
minimizing the error (PT|) at each iteration. Expression shows that this is also the best choice in order 
to maximize the convergence rate. However, the iteration will converge as long as the semi-implicit operator 
is large enough. The stability condition for the first iteration, a)^ At^ > oj'^AP — 4 from the previous section, 
is also a sufHcient condition for convergence. 

In the previous section, we demonstrated that the algorithm for Pmax = 1 is symplectic. We have also 
shown this is true for arbitrary Pmaxi and here we summarize the main steps. First write Equation (|4ip . 
which holds for Pmax > 1, in the vector-matrix form = B(5y"+^''' = (B)^ where (B)p 

denotes the matrix B raised to the p'th power (as opposed to an index superscript). Expressing the converged 
Crank-Nicolson solution as = Cy" (where the matrix C can be easily calculated), and the Pmax = 

1 solution as = Ay" (where A is given by Equation ([55]) ). this can be written as y"+i'P+i = 

[C -I- (B)P(A — C)] y" = Apy". The eigenvalues of the matrix Ap can then be calculated using a symbohc 
mathematics package for simplicity (we used Maple [2lj for this calculation). To reduce the complexity 
of intermediate equations, it is useful to express (B)^' = E(D)pE~^, where the columns of E are the 
eigenvectors of B and the diagonal matrix {'D)p = (0, 0; 0, (A2)'') (without yet substituting the actual value 
of the eigenvalue A2 from Equation (|42p). One can then show that the eigenvalues of Ap can be written 
in the form Ap^± = ci ± i^fca^ where c\ is real and a sufficient condition that ci be positive is again the 
stability criterion from the previous section, cD^At^ > w^At^ — 4. The terms c\ and C2 satisfy ci = 1, 
thus showing that this algorithm is symplectic for an arbitrary number of iterations Pmax- 

2.4. On the relation between the iterative semi-implicit method and a preconditioned Jacobian-Free Newton 
iterative method 

To investigate some of the properties of this algorithm, let us for simplicity consider now a more compact 
vector form provided by the following equation: 

^ = - P^V, (44) 

where ip = ipi{x^t) may be a vector field (with components i = 1..N), is a nonlinear operator, and 'Drj 
is a linear operator (which represents dissipation in our problems). Introducing the variable transformation 
^{t) — e^-^''*^(t), this becomes 

^ = e^''*.F(V) = Hi^). (45) 

(A definition is denoted by =.) For simplicity we will drop the tildes over variables in the rest of this subsec- 
tion. It is not hard to see that, in this case, the numerical scheme of equations ([291430]) (using equation ([13]) 
to approximate J-') is equivalent, for p > 1, to: 



At r 



J^(V'") + T{tP"+^'P) + - V'^+i-P) , (46) 



where J- is the semi-implicit operator, and therefore, chosen to be invertible. Equation (|46p can thus be 
easily manipulated to yield: 



At 



(47) 
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Wc now show how the above equation can be obtained with a Jacobian-free Newton algorithm coupled with 
a preconditioned simple functional iteration. A Crank-Nicolson discretization of equation (j45p leads to: 



(48) 



As mentioned before, directly inverting the LHS of this equation is often hard and impractical. A common 
and useful approach is to resort to preconditioned iterative methods. For nonlinear cases like this, it is 
useful to use an outer Newton iteration for the nonlinear part of the problem, and an inner iteration with 
preconditioning for the resulting linear part of the problem. I.e., introducing a Newton iteration count p, 
the above equation becomes: 



n+l,p+l 



Hi' 



ST 



ii 



n+l,p+l 



n+l.p\ 



(49) 



Because SJ- /Sip is a complicated, large sparse matrix, it is still impractical to directly solve this problem. 
If we have an approximation to SJ- /Sip, denoted by J-, that is not too hard to invert, we can use this as a 
preconditioner, and write equation (j49p as 



At 



AtS^ 



n+l,p 



(50) 



This is of the form A^-'^ ^7/'"+^'''+^ = b''. If the preconditioner is sufficiently good, then A^^A is close 
enough to the identity matrix that a simple functional iteration should be sufficient. Using an index q to label 
these iterations within each Newton step, this would become = 5p_(_(i_^-i^)^"+i.p+i.'?^ -^^ith 

the initial condition ■(/;"+1'P+1'0 = ip^+^'P. This should converge as long as the magnitude of the eigenvalues 
of (1 — A^^A) = A^^{A — A) are all less than unity. If we do only one inner functional iteration {q = 0) 
per outer Newton iteration, then this exactly reproduces equation (|47p . [Alternatively, replacing SJ^/Sip 
in equation (j49p with the approximate Jacobian J- leads to equation (j47p .] The Iterative Semi-Implicit 
algorithm presented in this paper is essentially this, with a particular physics-based form for J- appropriate 
for high frequency waves arising from coupling between differential equations, and a modification of the first 
step to give a symplectic second-order accurate result after one corrector step. 

More advanced iterative methods could be used, such as Krylov methods like GMRES or restarted Loose 
GMRES [i^l, and will accelerate convergence for some types of problems. (Also, because of the various 
approximations made, our simplified algorithm will not have the asymptotic quadratic convergence of a full 
Newton algorithm on the nonlinear part of the problem.) However, we have found that simple functional 
iteration works quite well for our problems, converging in just an iteration or two. In part, this is because 
there is very little energy contained in the high frequency waves in these problems. In principle, convergence 
might be slow if there were a significant component of the error vector in these high frequency waves, 
for which A~^{A — A) has the largest eigenvalues. But there is very little physical coupling between the 
low frequency dynamics of interest and the high frequency waves, so if they are treated in a numerically 
stable way, they will have been damped away on previous time steps and do not require many iterations on 
subsequent time steps. 

Although we use a simple functional iteration, note that the optimal method for inverting the physics-based 
preconditioner (1 — ^J^) itself might in some cases be an iterative algorithm (more on this in section !??^ . or 
in other cases with FFTs or a direct sparse matrix solver. For our application, we are already using FFTs for 
a pseudo-spectral evaluation of the nonlinear terms, so it is easy to implement and very efficient to directly 
invert our physics-based preconditioner in wave-number space. 
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2.5. Second order operator splitting 



In the derivation of the numerical scheme of section l^TTl the nonhnear advection operators and the diffusion 
operators are treated differently. Operator splitting methods can give rise to splitting errors in numerical 
schemes, resulting in an overall order of the scheme which is lower than that of any of the individual methods 
applied to treat each of the operators. In this section, we show that our scheme is second order accurate 
even if these operators do not commute. To demonstrate this point, it is sufficient to consider the paradigm 
posed by the compact vector form of equation ((44|) . Crank-Nicolson discretization of equation (j45ll yields 



n+l 



At 



Undoing the variable transformation and rearranging gives 



(51) 



(52) 



Through second-order accuracy, the LHS can be expanded as LHS==^/'"+i - (At/2) + (ajf"/a7/')(V'"+i 

—?/;")]• Then the above equation becomes 



AtaF\ 



71 + 1 



At 



From a second order Taylor-series expansion, ip{t + At) = i/'(t) 
to equation (|^^ through second order accuracy should be: 



At 

At^' ■ 



AtdJ" 



r 



(53) 



2 dip 

^At^ip" , we see that the solution 



n+l _ 



V'" + AtT{i''') - AtVr,ip' 
At^ 



(54) 



Inverting the operator on the LHS of equation ([55)1 and expanding, one can show that equation ([55]) 
agrees with equation through second order. Thus we have demonstrated that, at least in the limit 
of AtdT/dip <C 1, our splitting scheme is second order accurate even if the advection {T) and the diffusion 
(2?^) operators do not commute. Numerical demonstration of the second order convergence of the algorithm 
for our test problem will be presented in section |4l where it will be seen that the algorithm remains second 
order accurate even at values of At considerably larger than those required by an explicit integration. 



2.6. Robust damping 



Let us consider a simple test problem based on a linear scalar version of equation (|44p with J- = —iuj and 
V^i a real number, 

^ = -(ic^ + I?^)V. (55) 

The analytic solution is straightforward: 

4' = ^06"^"^+^"^*, (56) 

where "00 is the initial condition. A plot of the magnitude of this solution is shown in Figure [1] for the 
parameters lo = 10, ipo = 1 and P,, = 1 (full black line). Also shown are the numerical solutions of equa- 
tion (|55l) with three different numerical methods: in red (dashed line), the solution yielded by equation ([52)) 
(i.e., the method described in this paper) ?/>"+^ = (1 -|- ia;At/2)~-'^e~-^''^*(l — ia'At/2)-0" ; in green (dotted 
line) the result obtained by the analytic integration of the diffusive exponential, as described in appendix [Al 
and, in blue (dash-dotted line), the solution given by a Crank-Nicolson discretization of the RHS. As seen, 
for LoAt < 1, all three methods reproduce the exact solution. However, only the discretization of equa- 
tion (j52p is capable of capturing the behavior of the exact solution for any value of At, as is desirable for 
a robust treatment of damping with implicit integration, where P^At ^ 1. One might try to improve the 
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Fig. 1. Plot of the solution of equation I I55I I as obtained analytically, expression 1561 1. and with three different numerical methods. 
Input parameters are ui = 10,1'r; = 1 and i/jq = 1. 



CN approach by using CN for the wave term and backwards Euler for the damping term, i.e., discretize 
equation ([55]) as - = + V'")/2 - 2?r,V'"+^- However, in the hmit At ^ oo this gives 

^n+i _ _ i2'Drj), which for I?^ <C lu gives very little damping, even though V^jAt >> 1. 

The algorithm presented here preserves the property of robust damping, which mathematically is termed 
L-stability. An algorithm that is stable for the equation (|55|) for arbitrary At (for 2?^ > 0) is called A-stable. 
An algorithm is L-stable if it is A-stable and has the property of the exact solution that -;/;"+ ^ /■)/)" — as 
At oo. It is well known that Crank-Nicolson is A-stable but not L-stable. In physical terms, while Crank- 
Nicolson has the nice property of giving no artificial damping of pure wave problems (i.e., it is symplectic for 
V,/ = 0), it has a difficulty in not giving significant damping for problems that should be strongly damped in 
the large At limit. Examples of L-stable algorithms include the 2cd order Backward Differentiation Formula 
(BDF2), and certain Rosenbrock or Implicit-Explicit Runge-Kutta algorithms. The iterated semi-implicit 
method presented here is based on a kind of integrating-factor variation of Crank-Nicolson that makes it 
L-stable. (An algorithm based on the Modified Crank-Nicolson algorithm 2^, 24 1 could be another way to 
improve the treatment of damping.) Alternatively, one could develop an iterative semi-implicit algorithm 
starting from BDF2 instead of CN. This would be automatically L-stable and would treat the damping and 
other terms in a balanced way that might have certain advantages for some applications. On the other hand, 
BDF2 introduces a weak numerical damping of resolved waves, /■!/;"' | « (1 — {u!At)'^/4: + 0{At^)), while 
the algorithm presented here has the advantage of retaining the CN property of being dissipation free in the 
limit = 0. Future work could investigate the relative advantages of these different approaches. 



3. Application gyrofiuid magnetic reconnection 



The field of plasma physics is particularly rich in stiff wave problems. One such application is magnetic 
reconnection 2^, 2^ , a phenomenon in which magnetic field lines that arc being convcctcd with the plasma 
fiows suddenly break and reconnect in a different configuration. Magnetic energy is released in this process, 
giving rise to high velocity plasma flows, energetic particles and plasma heating. Reconnection is the cause 
of solar flares [23, [1^; it is also manifest in the interaction between the solar wind and the Earth's magnetic 
field [2^ in the magnetopause [3(| and magnetotail [3l|. In fusion devices it plays a crucial role in the 
development of the sawtooth instability, which can be detrimental to the plasma confinement [3^ . 
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In the absence of external forcing, magnetic reconnection arises as a result of a well known plasma instabil- 
ity called the tearing mode (33j. Regardless of the physical framework [i.e., resistive magnetohydrodynamics 
(MHD), Hall-MHD ^ I. kinetic, etc.], this instability is particularly hard to simulate numerically (see, e.g., 
0, [HI)- The reason is essentially one: the intrinsically multiscale character, both temporal and spatial, of 
magnetic reconnection. With respect to its time evolution, the problem arises because the typical growth 
rate 7 of the tearing instability is much smaller than the frequency of the fastest waves present in the system 
(e.g., Alfven waves. Kinetic Alfven waves, whistler, etc.); spatially, this problem requires the simulation of at 
least two, but often more, spatially distinct scales: the large scale equilibrium and the short scale dissipation 
layer. The separation between these two scales is a function of the magnitude of the field-line breaking terms 
(e.g., resistivity, electron inertia) which, in order to address a physically meaningful region of parameter 
space, have to have extremely small magnitudes, and thus become important only in very narrow regions 
in the plasma where strong gradients arise. It follows from these conditions that numerical simulations of 
magnetic reconnection require extremely high spatial resolutions and either an implicit time integration 
scheme or, if an explicit scheme is adopted, timesteps which are much smaller than I/7. 

In many plasmas of interest, such as those in fusion devices, or at the magnetotail or solar flares, it 
has long been suspected that an MHD description is too simple to fully explain the complexity of the 
observations. The discrepancy is obvious in the observed and calculated reconnection rates which, in the 
MHD framework, differ by several orders of magnitude. Interest has diverged to non-MHD effects which 
might cause a speed-up of this process 34, 35, 3^. The potential candidates differ depending on the specific 
geometry and plasma parameters. For example, in most present fusion plasmas, the ion Larmor radius 
exceeds, or at least is comparable to, the width of the dissipation region. Thus, finite ion Larmor radius 
(FLR) effects cannot be neglected and are, in fact, known to produce a speed up of the reconnection 



37| . From the numerical point of view, the FLR terms bring about one further complication: their 



rate 

inclusion in the equations fundamentally changes the dispersion relation of the shear Alfven wave, essentially 
generalizing it to the so-called Kinetic Alfven wave (KAW), which has a high frequency dispersive character, 
i.e., w ~ fc^, where w is the wave frequency and k± the wavenumber (the counterpart to the FLR effects in 
the absence of a strong magnetic field is the Hall term in Ohm's law, which also introduces a high frequency 
dispersive wave, the whistler ~- see for a detailed discussion of the relative roles of these waves). Explicit 
numerical integration schemes are forced to resolve the highest frequencies present in the system, i.e., they 
must integrate on a timestep At ~ l/ujmax- Therefore, when dispersive waves are present. At ~ max- 
Since k±_j-nax ^ 1/^2:, where Ax is the grid-spacing, the time step becomes impractically small as higher 
resolutions are required to resolve the extremely thin reconnection layer. Reconnection studies with FLR 
or Hall terms typically use hyper-diffusion (or hyper-resistivity) to limit how thin the reconnection layer 
becomes, which helps limit how small the grid spacing Ax needs to be, and thus allows the time step At 
to be larger than it would be otherwise. There can be a certain amount of physical hyper-resistivity in the 
Ohm's law due to electron viscosity, and the macroscopic physical results arc found to be independent of 
the choice of hyper-diffusion over a range of small hyper-diffusion, but if the hyper-diffusion is too large it 
can affect some of the physical results of interest. For typical values of hyper-diffusion used in simulations, 
the required grid resolution is still fairly fine and the resulting explicit time step limitation is still quite 
severe, so one is motivated to search for an implicit or semi-implicit algorithm to bypass this severe time 
step limitation. 



3.1. Equations 

To study FLR effects in magnetic reconnection we chose a 2 field gyroffuid model, described by the 
following normalized equations: 



^ Hall-MHD consists in extending the usual MHD Ohm's law to include the so-called Hall term, J X B. 
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+ [0, Tie] = [V', ViV^] + u\7\ne - L'H^ine, (57) 

^ + [0,7^] =p2[„^,^] + ,,Vi(V - ^'e,) - ??hV1(V - ^e,), (58) 



dt 



1 



fo(6)-l (59) 



Here ne is the perturbed electron density, (j) represents the electrostatic potential, ip is the magnetic flux, 
related to the in-plane magnetic field by B± = x ^tp, [P, Q] = dxPdyQ — dyPdxQ is the Poisson 
bracket and pi and ps are the ion and the ion-sound Larmor radius, respectively. Auxiliary quantities are 
the parallel current density, jy — where = d'^ + dy, and the flow velocity, Vj_ = x V(/). 

Dissipation is provided by the viscosity and the resistivity rj. We also add, for physical reasons and 
numerical convenience, phenomenological hyper-diffusion coefficients, and tjh 3^, 3]. In equation 



the equilibrium is subtracted from the nonideal terms on the RHS in order to impose an ideal equilibrium 
[this is equivalent to imposing an external electric field, E^xt = {'il'^'± ~ Vh^±) i^eq] - see appendix IB] for 
how to include this term in the numerical scheme of equations (j27H30p . Time is normalized to the Alfven 
time, lengths to the equilibrium scale length. 

These equations are reminiscent of the more complete models of Snyder and Hammet_t_ 41| and Schep 



et al. [44I. Equation ([55]) is known in the literature as the gyrokinetic Poisson equation [43|. The integral 
operator Tq expresses the average of the electrostatic potential over rings of radius pi. In Fourier space, this 
operator simply becomes 

Toib) = e-'loib), (60) 

where b ~ k'^pf and Iq is the modified Bessel function of zeroth order. In other words, within the gyrokinetic 
work frame, the particle is allowed to experience different values of the electrostatic potential as it orbits 
the magnetic guide field. The purpose of the Tq operator is to average (f) over such orbits and over the 
Maxwellian velocity distribution. When implemented numerically in real space, a Padc approximant of this 



operator is often used [4J|: 



Toib)-l^-j^^, (61) 

which converts equation (|59p into: 

(1 - pfVl) n, = Vl<t>. (62) 

In our case, equation (j59p is solved in Fourier space and thus no approximation to rn(&) is required. In 
the limi t k \ pj — > 0, the above set of equations reduces simply to the Reduced MHD (RMHD) model of 
Strauss [ij 

Sets of fluid equations similar to the above, or their coUisionless version, which further include electron 
inertia, have been in use for quite some time to study magnetic reconnection (e.g. Kleva et al. [3^). The 
particular form presented, however, differs from most in that it also includes the effect of finite ion temper- 
ature (i.e.. Pi 7^ 0) and is unique, so far, in magnetic reconnection studies, in keeping the full form of the 
Gyrokinetic Poisson law, equation ([55)) . instead of its Pade approximant, equation ([5^ (see, e.g., Grasso et 
al. 0). 

For simplicity, let us for now ignore the diffusive and hyper-diffusive terms and consider an equilibrium 
described by B^^^q = BoBy, with Bq a constant, rie^eq = and 0eg = 0. We want to linearise equations (j571 - 
I59|) assuming that all the fields can be written as x = Xeq + X^i^^ ?/)i where xi represents small perturbations 
to the equilibrium of the form x^i^^u) ~ Xi (a;)e''^''^. Dropping the subscripts, equations (|57H58p become: 

^=9"., (64) 

where 
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f- 



g = -ikyBo - 



ro(6)-i. 

Combining these two equations, we have: 
and similarly for n^, where io^ is given by: 



-fg 



To{b) - 1 



(65) 
(66) 

(67) 

(68) 



This is the general dispersion relation for the Kinetic Alfven wave (KAW). In the limit of k^pi ^ 1 this 
relation reduces to: 

'3 1^ 



1 



R^ 



(69) 



where r = Ti/T^. The first term on the RHS constitutes what is usually referred to as the shear Alfven wave, 
and the remaining fc^p^ terms are the FLR corrections. [Note that |fc • Sggp = ^yB^^y = k^Bg because 
fc^ = in this 2-D problem and the guide field i3eg,z only enters through the gyroradii pi and Ps-] The 
opposite limit of k±pi^ 1 gives: 

u;' = klplklBl (70) 

where p^. ~ pl + pf. The numerical difficulties faced in the explicit integration of the set of equations 
at fcj^pi > 1 are obvious from expression (j70p . where we basically find that lo cx k±^ky cx k^. 



3.2. Semi- Implicit operator 



Based on the linear dispersion relation, equation ([68|) . a plausible first choice for the SI operator up' could 
be: 

-^ = fci(p^-j;^(|n)^^oSo, (71) 

where oq is a dimensionless constant, set to a value that ensures the stability of the algorithm and, to account 
for a spatially varying equilibrium magnetic field, B^q = Beq{x), Bq can be chosen to be its maximum value. 
However, in the nonlinear regime, the magnetic field will no longer be a function of x alone, but rather an 
arbitrary function of x and y. This suggests a qualitative generalization of this expression in order to also 
apply to the nonlinear regime: 

where we simply have made the transformations ky k±_ and Bq — > B±^rnax, where we define B±_„iax = 

max{^ B'^ + By) is the maximum magnitude of total in-plane magnetic field, updated at each new timestep. 

Alternatively, one could try developing an SI operator that included the spatial dependence of the magnetic 
field, but it would no longer be analytically invertible in k, and would require alternative solvers that are 
not available in our code at present. (For example, some rational function approximations for an SI operator 
that included spatial variation of Bq might be efficiently invertible in real space using a multigrid solver. 
Sec Ref. [llj for some other possible approaches to preconditioners / SI operators for problems with strong 
inhomogeneities, anisotropics, and multiple waves.) Of course, for problems in which the magnetic field has 
spatial variation, as in the application of the next section, equation (|68p is no longer an accurate dispersion 
relation for the system (i.e., it is no longer an eigenvalue of the linear problem, and Fourier modes are no long 
eigenvectors). Nevertheless, as we shall see in sectional the SI operator based on a constant Bo is found to be 
fairly effective for the case considered there. (In part this may be because stability of the SI algorithm only 
needs the approximation uj'^ to be sufficiently large. While an SI operator that is larger than necessary may 
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require more iterations to converge from an arbitrary initial guess, this can be offset if the initial predictor- 
corrector step is sufficiently accurate.) In places where k±pi ^ 1, this operator simply reduces to the spectral 
equivalent of the real-space operator proposed by Q for the whistler wave. Note, however, that whereas 
k±Pi will certainly be very large in the vicinity of the X-point, it will also be negligibly small far away. It 
is therefore very useful to retain the full form of the Tq operator in the expression for d). Because our code 
works in Fourier space, this operator is trivial to implement without resorting to any approximations. Since 
it does not require any convolution, the computational cost of evaluating expression (|72p at each timestep 
is negligible. In the conclusions we address the issue of how it might be generalized to a real-space grid 
implementation. 

An important property of this SI operator is its ability to act on each value of k±_ individually. For instance, 
the magnitude of tD is small for the lower values of k± , meaning that the large scale features are left relatively 
unaffected. Also, it is well behaved in the limit of k±pi ^ 1, where it reduces to 

'3 1~ ' 



1 + fcip; 



Oq-Bj^ j„(j2., (73) 



a form which makes transparent the operator's ability to equally stabilize the shear Alfvcn wave. 

Finally, we note that the substitution ky — > k± in the expression for the SI operator implies that during 
the linear and early nonlinear regimes the parameter uq can be set to values which are smaller than 1. This 
is important since, as mentioned, we want the operator a) to be sufficiently large to satisfy the stability 
condition > (and ideally, o)^ as close as possible to for all wavenumbers), but not so large that 
convergence is slow and the error defined by expression p2|) (which is proportional to Oq) is larger than it 
need be. 



3.3. Semi-implicit error control and timestep determination loop 



An important part of the algorithm is the semi-implicit error control and timestep determination loop. 
The basic idea behind it is to enforce the limit on the semi-implicit error prescribed by the user, 

max|£:P| <£:"°^ (74) 

while maximizing the timestep that can be taken. This is done the following way. The discrctized set of 
equations (|27H30p is iterated until p = Pmax- Then, the following conditions determine the next step: 

- (a) £ < O.S^""^ : At"+i = C+Ai" 

- (b) 0.8 <£ < : At"+i = At" (75) 

- (c) £ > : loop is repeated with At"+i = At" 

where > 1 and < 1 are constants. The goal is to keep the error below but as close to the maximum as 
possible. Based on the second order convergence of the numerical scheme, we set C+ = 1.08 and ~ 0.92. 
At every timestep, the CFL condition for the plasma flow velocities is also evaluated, according to: 

^^cfT = 0-1 

The next timestep is the minimum of expressions ()75ll76p . It is important to note that the error control and 
timestep determination loop just described makes the choice of oq less vital to the algorithm than otherwise. 
The reason is that if the calculation is started with values of oq and At such that the SI scheme is unstable, 
this will cause the error to diverge until it reaches the maximum allowed, at which point A< is decreased. 
This cycle will be repeated until a sufficiently small At is found for which the SI scheme is stable at that 
(fixed) value of oq. Of course, if ao is set to too small a value, stability of the SI algorithm might only be 
available at values of Ai much smaller than those which can be achieved at larger oq. Finally, the reader is 
again reminded that this error is not the total error of the integration, but simply that introduced by the 
semi-implicit operator. When the iterative procedure has converged, there is still an underlying error which 
is that of the Crank-Nicolson scheme. 



Ax Ay 



(76) 
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4. Numerical Tests 



As mentioned, we have written a fully parallel pseudo-spectral code to evolve equations (|57H59)) . An 
explicit time-stepping version of this code has been in use for some time to study the evolution of the 
resistive tearing mode in the single fluid MHD limit {pi,Ps = 0). The code uses a Fourier basis set on 
a 2-dimensional domain with periodic boundary conditions. Details and extensive benchmarking can be 
found in [H^l; see also jsij for recent results. The explicit algorithm uses a modified version of the Adams- 
Bashford 3rd-order method that allows for variable timesteps [s^] , combined with a Crank-Nicolson scheme 
for the dissipative terms (in this case, the use of a CN scheme for these terms is fully justified because 
rjk'^At ~ 77 ^ 1, since At ~ 1/ujkaw ~ 1/^1)- The timestep is determined by the CFL condition, which is 
evaluated at each timestep according to the formula 

^ ^ . \ Ax Ay Ax Ay 2 1 
At^O.lmin , ,75-^' ' (77) 

where ujKAW.max is an estimate of the maximum frequency of the KAW on the grid, defined as: 

^KAW,raa.x ~ ^X.max I Ps ~ n 7 ) ^j^, max -S^, max (78) 

The code also allows the resolution in the j/-dircction (i.e., perpendicular to the equilibrium direction) to vary 
throughout a run, as needed. When more resolution becomes necessary in this direction the run is stopped 
and the number of grid points doubled. The new data values necessary for this procedure are obtained via 
linear interpolation between two adjacent points. This is a very useful feature for reconnection runs, as the 
needs for resolution in the y-direction increase greatly from the linear to the nonlinear regime. 

The base-case chosen for comparison between the SI method of equations ((771 - [5(I)) and the explicit code 
is characterized as follows. The equilibrium is defined by '^^'^^ = ^po/ cosh{x)^ and 0^°^ = 0. We set tpo = 
3-\/3/4 so that the maximum value of the equilibrium magnetic field is 1. Other parameters are the resistivity 
and the viscosity coefficients, set to 77 = i/ = 5.10"'', and the instability parameter A' = 17.3 (i.e., the 
simulation box size is Lx = 27r, Ly = 2.187r). The ion and the ion-sound Larmor radius, pi and ps, are set 
to 0.02. The hyper-resistivity r]H aud hyper-viscosity vh are grid dependent, and their value is calculated 
at each timestep, according to the formula: 

r]H = VH = 0.1 WKAW,max/fci_max- (79) 

Tests have been performed to insure that these coefficients are sufficiently small not to alter the physics 
of the system. The tearing mode instability is initialized by perturbing this equilibrium with -i/)^^-* = 
— 10~^ cos(27r/Ly y). The initial resolution is 3072 x 128 grid points; the number of grid points the y- 
direction is doubled throughout the run, up to 2048 for t >« 210. The most crucial test is to check how 
closely the semi-implicit method described in this paper reproduces the results obtained by a conventional 
explicit integration. This comparison is performed in Figures ([5H3]). Figure [5] shows the growth rate, defined 
as 7 = dlogipx /dt, where X is the location of the X-point, obtained by the two methods. The agreement 
between both approaches is remarkable. After an initial transient {t <~ 30), the tearing instability is seen 
here to evolve through four different stages: the linear stage (constant growth rate), the early nonlinear 
period, from t w 130 — 170, followed by the explosive stage, during which the growth rate dramatically 
increases {t w 170 — 220) and finally the growth rate slow down and saturation period. 

A comparison between the contour plots for all the fields at the early stages of the explosive period 
{t ~ 204) is shown in Figure [31 Again, the figure yields excellent agreement between the explicit and the 
semi-implicit approaches. 

Plotted in Figure [Jlja) is the CPU speed-up factor, determined as follows: 

CPU _ At 

CPUexp ~ {NRHs/2)At,xp ^ ' 

where NfjHS is the number of evaluations of the RHS required at each timestep for the SI scheme to converge, 
compared to an explicit scheme that requires only 2 such evaluations per timestep. CPUexp and Atexp are 
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Fig. 2. Comparison between the explicit and SI results. The Figure shows the growth rate of the tearing instability, 
7 = dlogrpx /dt, obtained by the explicit method (dashed black line) and the SI method with £"^"-^ = 10~^ (dotted red 
line). The two curves show excellent agreement and visually overlap exactly. 



the explicit integration CPU time and timestep, respectively. As mentioned, convergence is determined by 
£ < 10^^ at each timestep. A plot of the time evolution of the maximum error is shown in Figure |4ljb). 
The dashed lines in this figure identify the "comfort zone" for the SI error, defined by expression ((75|) . We 
see that for both values of Pmax plotted, the error is indeed largely confined to that region, indicating the 
second-order convergence of the iterative scheme. The large undershoots mark the points at which the run 
was restarted with a larger resolution in the y-direction. At each restart, the At is set to be equal to that 
determined by the CFL condition. As shown in FigureHJa), the SI method is remarkably successful, yielding 
a CPU speed-up factor ranging from a minimum of ~ 20 during the explosive growth period, to several 
hundreds during nonlinear saturation of the instability. We also observe that, in this case, the iterative 
scheme is very efficient, as is shown by the larger speed-up factors allowed for Pmax = 2 (red dashed line). 
The actual timesteps taken by the SI method are shown in Figure [5] We see that the SI method performs 
so well that the minimum CPU enhancement (~ 20 at t « 210) is set by the flow speed, i.e., the actual 
dynamics that we are interested in following (recall that the CFL condition for the flows is imposed as a 
maximum for the timestep). 

4.1. Numerical test of the second-order accuracy of the algorithm 

We discussed the 2cd-order nature of the algorithm in section [21 and in section 12.5! we provided an 
analytical demonstration that the algorithm remains second-order accurate for sufficiently small A< even 
if the diffusion and advection operators do not commute. These arguments arc rigorously valid only if 
AtdJ-/dip ^ 1. However, the goal of implicit and semi-implicit algorithms is to allow usage of very large time 
steps compared to explicit algorithms. Thus, although the dominant modes of interest have eigenfrcquencies 
uj of id J- /dip such that Atcu is presumably small, there will be some high frequency modes for which the 
effective AtdJ^/dip is large. To demonstrate that the scheme can indeed be second-order accurate (at least 
for our test case) at desirable implicit values of At (i.e., which greatly exceed those required by an explicit 
integration), we performed a scaling study of the error of the algorithm with the timestep At, using as a 
test case the application discussed in the preceding section. This study was done as follows: starting at time 
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Fig. 3. Contour plots of the fields measured at an island width W ~ 1.6 (t » 204). Left column show results obtained with the 
explicit integration scheme, right column those obtained with the semi-implicit scheme. From top to bottom, shown are the 
contours of ip, , <j> and n^. 
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Fig. 4. Fig. (a) shows the CPU speed-up factor over a conventional expUcit integration for Pmax = 1,2. The timestep taken by 
the SI scheme is determined by imposing £ < 10^'^, as seen in Fig.(b). The dashed horizontal lines limit the error "comfort 
zone", defined by expression I I75I I. The largest undershoots in both figures (at times in the vicinity of t ~ 200) are due to code 
restarts at those times, when the resolution in the j/-direction is doubled. 



t w 210, when the simulation is fuhy nonhnear (see Figure [2]) and the SI algorithm is using a time step of 
At ^ 9 X lO^"', we run the code for an additional time of 0.135 using a very small time step of At = 10^^. 
This value is smaller than even the explicit timestep At ~ 1.5 x 10"^ that would be required at this stage, 
and we take the result of this integration as our exact result. Then, we perform several runs over this same 
time period of 0.135 with various fixed values of the time step At. To isolate the scaling of the error with 
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Fig. 5. Plot of At vs. time achieved by the numerical scheme of equations (I27ti30t for the test case and for pmax = 1 (black 
full line) and pmax = 2 (red dashed line). Also plotted for comparison are the timcsteps due to each term contributing to the 
CFL condition 1771 1. Note that the timestep taken by the semi-implicit method coincides with that based on the flow velocity 
during the explosive part of the instability. 



At, this scaling study was performed with a fixed number of iterations per time step, Pmax = 2. The error 
is the difference between the results of these integrations and the "exact" result (we have chosen to plot 
the relative error in tp at the X-point). The result is shown in Figure [6l where a clear scaling of the error 
with (At)^ is seen. Note that, at each value of the iteration index p, we have -0"'^ = fptrue + ^sf ~^ ^discr 
(and similarly for (/)), where ^true is the true solution (here taken to be that given by the integration with 
Ai = 10^^), £si is the error introduced by the SI operator, and Ediscr. is the error due to the particular 
discretization scheme used (e.g., when our scheme is converged in the p-iterations, Ediscr. is simply the error 
of the CN method). Thus what is plotted in the figure is the total error, due to both the SI error and the 
time discretization error. In the figure, the vertical dotted line identifies the timestep that would have been 
required for stability by an explicit algorithm at this point of the simulation. The dashed line shows a At^ 
slope. We thus see that the second-order accuracy of the algorithm extends at least up to the typical SI 
values of At ~ 9 X 10~^, which is 60 times larger than the explicit time step limit. 

5. Conclusions 

We have presented an efficient and accurate numerical method for the time integration of stiff wave coupled 
partial differential equations. The method allows for timcsteps which greatly exceed those imposed by the 
CFL constraint, while accurately reproducing explicit calculations. CPU time savings over a conventional 
explicit scheme range from factors of ^ 20 to several hundreds for the test case chosen (gyrofluid magnetic 
reconnection) . The method is second-order accurate even if the diffusion and nonlinear (advection) operators 
do not commute, and exhibits the property of robust damping. I.e., it correctly captures the analytic solution 
in the limit I?,jAt 3> 1, where I?^ is the diffusion operator and Ai the timestep. In the limit of no physical 
dissipation, the method is symplectic (at least for a linear wave test problem), i.e., it does not introduce any 
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Fig. 6. Scaling of the relative error with At. Dashed line indicates (At)^ slope. Vertical dotted line marks the explicit timestep 
limit that would be required at this point in the simulation. 



numerical dissipation. 

The derivation of the SI numerical method presented here relies on the physics based assumption that 
the fundamental wave dynamics arises as a result of coupling between equations. In the linear regime (small 
perturbation amplitude) this statement is exact for many physical systems. Our results demonstrate that 
this same SI operator (scaled to approximate an upper bound on the frequency in the nonlinear regime) 
continues to be effective in the nonlinear regime as well, at least for the test problem explored here. We 
note that the application chosen to study the efficiency of our algorithm supports only one wave (the 
KAW). It is therefore easier to formulate an efficient SI operator (or an efficient preconditioncr for an 
implicit algorithm) for this system that would be the case for a system of equations that supported multiple 
waves (i.e., multiple polarizations at each wave number). While the isotropic and homogeneous SI operator 
used here was fairly effective for the test case we considered, which had some degree of anisotropy and 
inhomogeneity, there may be cases with stronger anisotropics or inhomogeneities where a more complicated 
SI operator or preconditioncr is needed. Some approaches for dealing with these issues (of multiple waves 
with anisotropics and inhomogeneities) are discussed in Ref . . The efficiency of the iterative SI algorithm 
for a multiple wave system remains to be tested. 

One of the advantages of a physics-based semi-implicit operator is that one only has to approximate 
a'^(fc), not the dispersion relation uj (k) itself (and in fact, one only needs an upper bound on to insure 
convergence). Chacon and Knoll [lll.Tl4. have discussed how this kind of physics-based semi- implicit op- 
erator can be related to preconditioning, and how it turns an originally hyperbolic problem into a diagonally 
dominant parabolic problem that can be efficiently solved by multigrid and/or preconditioned Krylov (if the 
preconditioncr is sufficiently effective) iterative techniques. In some cases, the resulting parabolic problem 
can also be efficiently solved with FFTs, which we use here. 

The Iterative SI approach used here provides a way to monitor and control the SI operator splitting 
errors, and essentially turns a semi-implicit algorithm into a fully implicit algorithm. Accuracy, rather than 
stability, sets the time step At. There are several variations of SI algorithms in the literature (such as 
[itI. [TsI. M [tI. [ssI. [lit ) . some using staggered time grids. If only one Pmax = 1 iteration is done, the underlying 
SI algorithm used here is equivalent to reference The iterative approach presented here could also be 
applied to other variations of SI algorithms. This could be a straightforward way to extend other existing 
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codes that use SI algorithms. 

It is pertinent to compare the numerical method developed here to more elaborate implicit schemes and 
preconditioning strategies. Within these, Jacobian-free Ncwton-Krylov (JFNK) methods seem to be amongst 
the most sophisticated and promising (As discussed in section [531 our algorithm has similarities to a 
Jacobian-Free Newton method, but with simple functional iteration instead of Krylov iteration within each 
Newton step.) In this respect, the main comparison factor is the CPU time speed-up over explicit approaches 
- the main purpose of implicit schemes is, after all, to maximize this parameter. Of course, this is highly 
problem specific, and a general comparison is not possible. For a similar problem to the one dealt with in 
this paper, i.e., implicit integration of high frequency dispersive plasma waves, a JFNK implicit scheme was 
developed by Chacon and Knoll The aim of these authors is the accurate implicit integration of the 
whistler wave, which has a dispersion relation of the same type as the KAW, i.e., w ^ fc^. In that work, CPU 
time enhancements by factors of up to a maximum of 30 is quoted. This is roughly comparable to the 
enhancement factors of 30 and higher found for our method, as shown in Figure |4l[a). However, a definitive 
comparison is not warranted at this time, since both the physics model and the problem arc different (Chacon 
and Knoll use a more complex four-field model, the flux-bundle coalescence problem is different than the 
reconnection problem done here, grid sizes differ, there are subtle differences in the definition of the CFL 
condition that influence this comparison, and the amount of speedup depends sensitively on the stage of the 
nonlinear dynamics in our simulation). 

It is interesting to note that the case in Table I of [ll| that shows the largest speedup factors also 
corresponds to needing no Krylov iterations for each Newton iteration (as the authors point out, the precon- 
ditioner itself is providing a very good initial guess). In that case, the algorithm of reference becomes 
similar to the one presented here, (except for differences in the predictor step and in the treatment of dissi- 
pation terms), so one would in fact expect comparable performance. But short of more detailed comparisons, 
one should simply conclude that the Iterative Semi-Implicit method developed here seems competitive, at 
least for some problems, with these somewhat more complex approaches. The Iterative Semi-Implicit al- 
gorithm has the relative advantage of being simpler to implement (particularly for codes that arc already 
semi-implicit) and works well for the type of problem we have studied here, while a full JFNK algorithm 
would be more robustly convergent and may be more efficient for other types of problems where the present 
algorithm needs more than a few iterations. 

With regards to the implementation of a similar treatment of the diffusive terms in real space codes, 
we begin by noting that, in the final formulation of the numerical scheme equations ([2711301) . what has 
to be evaluated are operators like e~-^^*. One way to evaluate this with a real-space code could be with 
a variation of Exponential Propagation Iterative methods Another way would be to use a 2cd-order 
rational- function approximation 

e^'^^* « (1 + aVAt){l + bVAt)-\l + bVAt)-\ 

with a = 1 + \/2, & = (2 + a/2)/2. This retains the second order accuracy of a Crank-Nicolson treatment 
of damping while having the advantage of robust damping in the large VAt limit (unlike Crank-Nicolson 
where e"^^* « (1-1- VAt/2)-^{l - VAt/2) -1), and stiU remaining relatively easy to evaluate, since 
inverting the (1 + b'DAt) operator is equivalent to a standard implicit step. Note, however, that this has not 
been tested by us in this work. 

Our code works in Fourier-space, where it is very easy to invert the SI operator presented in sec- 
tion 13.21 but it should also be possible to implement a very similar SI operator in real space. Using a 
Pade approximation for the Fo function and Fourier-transforming to real space, equation (|72p becomes 
= —a^B\ raax^\ ~ Pt^I)- Thc resulting SI operator (1 -I- At^t2;^/4) is a symmetric positive-definite 
parabolic operator that could be solved with a preconditioned iterative Krylov solver or multigrid algorithm. 
(In some cases, a direct sparse-matrix solver can also be used.) It should be noted that if a Krylov solver is 
used, it is important to use a very good prcconditioncr. For these types of parabolic problems without pre- 
conditioning, the number of iterations required for Krylov solvers like Conjugate Gradients or GMRES scales 
with the square root of the condition number of the matrix 47, 4^, Nuers oc a/1 -I- At'^oj'^^^/ A oc AtuJmax, 
which would give it at most an order unity speed up over an explicit code. To get a significant speed up, an 
iterative solver must be aided by further algebraic preconditioning, such as Modified-ILU [i^, multigrid, or 
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multilevel additive Schwarz [14| . or some combination thereof. Again, we note that this has not been tested 
by us: in our application, direct inversion of the operator (|72p is trivial since we use a pseudo-spectral code. 

An important observation from the physics point of view is that these results, i.e., the fact that the KAW 
can be integrated implicitly without affecting the physics of the system, confirm the parasitic role of this 
wave in the reconnection process, although the terms from which they arise are essential to the observed 



speed up of the instability growth rate. A similar conclusion has been reported by Chacon and Knoll 11 1 
for the whistler wave. 

Finally, we note that this numerical method could be useful for other problems where an implicit treatment 
of high-frequency waves is desired, and in particular is trivially applicable to fluid studies of coUisionless 
reconnection, where electron inertia is added to equations ([57H59|) and replaces the role played by the 
resistivity in breaking the frozen flux condition. 
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Appendix A. Exact integration of the diffusive exponential in equations ([THH]) 

It is instructive to consider one logical alternative to the discretization of equations ([THH]) performed in 
equations ([9lll0p. For simplicity, let us instead consider the more compact form of the problem given by 
equation (j45p . A formal solution to this equation can be written as: 

^n+i = ^" + f dt e^^'Tii^). (A.l) 



In the Crank-Nicolson algorithm of equations ([QHlOl) , what is done is to approximate the integral with a 2cd 
order midpoint evaluation, i.e.: 

Ctn + l 

dt e^^*T{il^) « — [e^"*".F(i^") + e^"*"+i.F(i^"+i)] , (A.2) 

from which formulas of the kind of (|9ll0p directly follow. Instead, one can exactly integrate the exponential 
term, while using the CN discretization for the T operator alone. Undoing the variable substitutions of ([6]), 
we obtain: 

^n+l ^ ^-V,At^,n ^ ^ (■ j + ^(^,"+1)] . (A.3) 

Now following a procedure similar to that of section 12. 5i it can be shown that this method reduces to 
equation (|52p in the limit of ^At ^ 1, V^At ^ 1, i.e., it is also second-order accurate even if the operators 
and do not commute. It does not, however, exhibit the property of robust damping discussed in 
section 12.61 as can easily be seen by evaluating the above expression in the limit I?,,Ai ^ 1 (but it does 
correctly treat the case of J-{ip) = const. ^ which our proposed algorithm does not, see appendix B. A method 
to accurately deal with both cases in the limit of V^jAt ^ 1 is left to a future publication). 



Yet another second-order accurate alternative is Strang-splitting [5J]. For the model problem equation (|55|) . 
one form of Strang-splitting is V""*"^ = e-^''^*/2(x _^ ia;Ai/2)"^(l - iwAt/2)e-^'''^*/2-0". Similarly to our 
numerical scheme of equations ()27H30p . this also exhibits robust-damping. However, we found this method to 
be much less efficient in terms of the timestep enhancements that can be obtained over the CFL condition. 
Further tests would be required to understand why our splitting scheme is a better alternative than Strang- 
splitting for this problem; the interested reader is referred to a recent paper by Kozlov et al. jsH ] on the 
nature of the local error for splitting methods applied to stiff problems. 
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Appendix B. Inclusion of the equilibrium source term in equations (|27H30|) 



The equilibrium source term on the RHS of equation ([55]) . E^xt = T^rjipeq, is not explicitly accounted for 
in the derivation of the numerical method of equations ((27l - [30)) . Since ^ is a general operator, it can, of 
course, include this term. A more accurate alternative is obtained using the fact that d4>eq/dt ~ 0. Thus, 
equation (|58p can be recast in terms of V'l = V' ~ V'eg in the form 

^ = (B.l) 



Now defining = e ^'''■(/'i we obtain 

^"^1 -e^''*.F(0,ne,^), (B.2) 



dt 

which is of the same form as equation ([7]) and so the derivation of the method can be carried out as in 
section [53] Equations (|28i[5^ are thus replaced by: 

^n+l.* ^ e-C„At^„ ^ _ g-I?.At^ 

+ ^(l + e-^''^*).F(0",V"), (B.3) 

A/ Lj^Af^ - 
+_i;r(^»+i,p^^»+i,p) _ ^»+i.P). (B.4) 
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